A New Data Sieving Approach for High Performance I/O

Authors

  • Yin Lu, Computer Science Department, Texas Tech University, Lubbock, TX, USA
  • Yong Chen (corresponding author), Computer Science Department, Texas Tech University, Lubbock, TX, USA
  • Prathamesh Amritkar, Computer Science Department, Texas Tech University, Lubbock, TX, USA
  • Rajeev Thakur, Mathematics and Computer Science Division, Argonne National Lab, Argonne, IL, USA
  • Yu Zhuang, Computer Science Department, Texas Tech University, Lubbock, TX, USA

Published in: James J. (Jong Hyuk) Park et al. (eds.), Future Information Technology, Application, and Service, Lecture Notes in Electrical Engineering 164, DOI: 10.1007/978-94-007-4516-2_12, Springer Science+Business Media Dordrecht 2012.

Abstract

Many scientific computing applications and engineering simulations exhibit noncontiguous I/O access patterns. Data sieving is an important technique for improving the performance of noncontiguous I/O accesses: it combines small, noncontiguous requests into one large, contiguous request. It has proven effective even though it may access more data than was requested. In this study, we propose a new data sieving approach, Performance Model Directed Data Sieving, or PMD data sieving for short. It improves on the existing data sieving approach in two respects: (1) it dynamically determines when data sieving is beneficial; and (2) it dynamically determines how to perform data sieving when it is. Experimental results verify that it improves on the performance of the existing data sieving approach and reduces memory consumption. Given the importance of supporting noncontiguous accesses effectively and reducing memory pressure in large-scale systems, the proposed PMD data sieving approach holds promise for high performance I/O systems.
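The core idea of data sieving, and the kind of decision the abstract describes, can be illustrated with a minimal Python sketch. This is not the authors' implementation or their performance model: the function names (`merge_into_windows`, `sieved_read`, `should_sieve`), the `max_hole` and `max_window` parameters, and the simple latency-plus-bandwidth cost estimate are all assumptions made for illustration only.

```python
import io


def merge_into_windows(requests, max_hole, max_window):
    """Group (offset, length) requests into contiguous windows.

    A request joins the current window when the gap ("hole") before it
    is at most max_hole bytes and the window stays within max_window
    bytes. Both thresholds are hypothetical tuning knobs.
    """
    windows = []  # each entry: (start, end, [requests served by it])
    for off, length in sorted(requests):
        end = off + length
        if windows:
            w_start, w_end, members = windows[-1]
            hole = off - w_end
            new_end = max(w_end, end)
            if hole <= max_hole and new_end - w_start <= max_window:
                windows[-1] = (w_start, new_end, members + [(off, length)])
                continue
        windows.append((off, end, [(off, length)]))
    return windows


def sieved_read(f, requests, max_hole=64, max_window=4096):
    """Serve many small reads with one large read per window."""
    out = {}
    for start, end, members in merge_into_windows(requests, max_hole, max_window):
        f.seek(start)
        buf = f.read(end - start)  # one contiguous read, holes included
        for off, length in members:
            rel = off - start
            out[(off, length)] = buf[rel:rel + length]  # extract wanted piece
    return out


def should_sieve(members, window_bytes, latency_s, bandwidth_bps):
    """Toy cost comparison (an assumption, not the paper's model).

    Direct access pays one request latency per piece; sieving pays one
    latency but transfers the whole window, holes included.
    """
    direct = sum(latency_s + length / bandwidth_bps for _, length in members)
    sieved = latency_s + window_bytes / bandwidth_bps
    return sieved < direct


if __name__ == "__main__":
    data = bytes(range(256)) * 4
    pieces = sieved_read(io.BytesIO(data), [(0, 4), (10, 4), (500, 8)])
    for key in sorted(pieces):
        print(key, pieces[key].hex())
```

Under these assumptions, two nearby requests are served by a single read, while a distant request gets its own window. The cost check shows why sieving is not always worthwhile: when the hole between requests is large, the time spent transferring unwanted bytes can exceed the latency saved.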


Similar resources

Supporting Efficient Noncontiguous Access in PVFS over InfiniBand

Noncontiguous I/O access is the main access pattern in many scientific applications. Noncontiguity exists both in access to files and in access to target memory regions on the client. This characteristic imposes a requirement of native noncontiguous I/O access support in cluster file systems for high performance. In this paper, we address two main issues on supporting efficient noncontiguous I/...


Data Sieving and Collective I/O in ROMIO

The I/O access patterns of parallel programs often consist of accesses to a large number of small, noncontiguous pieces of data. If an application’s I/O needs are met by making many small, distinct I/O requests, however, the I/O performance degrades drastically. To avoid this problem, MPI-IO allows users to access a noncontiguous data set with a single I/O function call. This feature provides M...


I/O Optimization and Evaluation for Tertiary Storage Systems

Large-scale parallel scientific applications are generating such huge amounts of data that tertiary storage systems have emerged as a popular place to hold them. SRB, a uniform interface to various storage systems, including tertiary storage systems such as HPSS and UniTree, has become an important and convenient way to access tertiary data across networks in a distributed environment. But SRB is not optimi...


Extended collective I/O for efficient retrieval of large objects

Object-relational database management systems (ORDBMS) extend the capabilities of relational databases by allowing the definition of new data types and methods to operate on these data types while retaining most of the relational model semantics. In this paper, we examine issues related to parallel processing of queries in the object-relational model with respect to efficient storage and retrieval...


Predicting Performance of Non-contiguous I/O with Machine Learning

Data sieving in ROMIO promises to optimize individual non-contiguous I/O. However, making the right choice and parameterizing its buffer size are non-trivial, since performance prediction is difficult. Moreover, since many performance factors are not taken into account by data sieving, optimization for the access pattern and system is often not possible. In this work, we 1) discuss limitations...




Publication date: 2012